ABSTRACT

We present the first edition of a catalog of variable stars from OGLE-II Galactic
Bulge data covering 3 years: 1997-1999. Typically 200-300 I-band data points
are available in 49 fields between -11 and 11 degrees in galactic longitude, totaling
roughly 11 square degrees in sky coverage. Photometry was obtained using
the Difference Image Analysis (DIA) software and tied to the OGLE data base 
with the DoPhot package. The present version of the catalog comprises 221,801 
light curves. In this preliminary work the level of contamination by spurious 
detections is still about 10%. Parts of the catalog have only crude calibration, 
insufficient for distance determinations. The next, fully calibrated, edition will 
include the data collected in year 2000. The data is accessible via FTP. Due to 
the data volume, we also distribute DAT tapes upon request. 



*Based on observations obtained with the 1.3-m Warsaw telescope at Las Campanas Observatory of the 
Carnegie Institution of Washington 



1. Introduction 



The main goal of the Optical Gravitational Lensing Experiment (OGLE, Udalski, Kubiak
& Szymański 1997) is to search for microlensing events. Observationally, these events
are a rare type of optical variability, so it is no surprise that after
several years microlensing experiments have accumulated an exceptional record of variability
in terms of the number of objects and epochs. To maximize event rates, microlensing searches
focus on monitoring very crowded, and scientifically attractive, stellar fields: the Galactic
Bulge region and the Magellanic Clouds. Some observations are conducted in denser portions of the
Galactic disk.

It is a common situation nowadays that the ability to generate data far exceeds the 
ability to process it, and even more so, to comprehend it. The list of projects which 
aim at monitoring significant parts of the sky for variability includes more than 30 names
(http://www.astro.princeton.edu/faculty/bp.html), yet only a small fraction of those can process
the data efficiently enough to make the measurements publicly available soon after the
data is taken (e.g. Brunner et al. 2001). The issue of exporting the data in a convenient
form compounds the problem. The National Virtual Observatory (NVO) project has very 
ambitious plans to provide the tools and some standards (perhaps de facto standards) for 
processing the large amounts of information and web data publication (Szalay 2001). Large 
catalogs have added complexity (project description http://www.us-vo.org/). By the time
some sort of processing is complete, new information has emerged in the process, frequently
information which should be incorporated into the catalog. It seems that the only static layer
is the raw data itself, typically CCD images; however, the photometric output from number
crunchers should also be reasonably slow to change with new developments.

A regular practice in OGLE is to release the data in the public domain as soon as 
possible. The most significant contributions are: BVI maps of dense stellar regions (Udalski 
et al. 1998b, 2000a), Cepheids in Magellanic Clouds (Udalski et al. 1999a, 1999b), eclipsing 
variables in the SMC (Udalski et al. 1998a), catalogs of microlensing events (Udalski et al. 
2000b, Wozniak et al. 2001). Examples from other microlensing teams include samples of 
MACHO microlensing events (Alcock et al. 1997a, 1997c) and selected variable star work 
from MACHO (Alcock et al. 1997b) and EROS (Afonso et al. 1999). In real-time detection of
microlensing events the main benefit comes from the follow-up work (e.g. Sackett 2000), in
practice only possible with immediate publication on the WWW. Therefore, all major microlensing teams
(OGLE, EROS, MACHO and MOA) have, or had, active alert systems. 

A recent contribution to the publicly available data on variable stars is a WWW interface
to the MACHO database (Allsman & Axelrod 2001), which started with somewhat limited
features, but has plans for expansion. Similar ideas of making evolving catalogs have been
discussed within OGLE for some time now and are motivated by the challenges of data
processing/accessibility. The main objective here is to spare the broad community of potential
users a long wait until the team completes the final refined product. The data are potentially
useful at all levels of processing, as demonstrated by the serendipitous recovery of
high proper motion stars (Eyer & Wozniak 2001) and discovery of the longest microlensing
event ever observed, most likely caused by a black hole with a mass of several solar masses
(Mao et al. 2002), both found in preliminary OGLE catalogs. OGLE has just released an
online catalog of ~70,000 candidate variables in the LMC and SMC (Zebrun, Soszynski et 
al. 2001). With this paper we release an initial catalog of 221,801 candidate variables in the 
Galactic Bulge from Difference Image Analysis of OGLE-II data from seasons 1997-1999. 
Parts of the current edition are still not fully calibrated and should not be used in distance 
estimates (Section 3). 

We restate the basic information about the data in Section 2 and in Section 3 we briefly 
summarize the process of finding variability. Section 4 gives the details of how the catalog 
is structured, followed by final remarks and future plans in Section 5. 

2. Data 

All OGLE-II frames were collected with the 1.3 m Warsaw Telescope at the Las Campanas
Observatory, Chile. The observatory is operated by the Carnegie Institution of Washington.
The "first generation" OGLE camera uses a SITe 2048 x 2049 CCD detector with
24 μm pixels, resulting in a scale of 0.417" pixel^-1. Images of the Galactic bulge are taken in
drift-scan mode at "medium" readout speed with a gain of 7.1 e-/ADU and readout noise
of 6.3 e-. The saturation level is about 55,000 ADU. For details of the instrumental setup,
we refer the reader to Udalski, Kubiak & Szymański (1997).

The majority of frames were taken in the I photometric band. The effective exposure
time is 87 seconds. During the observing seasons of 1997-1999 the OGLE experiment typically
collected between 200 and 300 I-band frames for each of the 49 bulge fields BUL_SC1-49.
OGLE-II images are 2k x 8k strips, corresponding to 14' x 57' in the sky, therefore the total
area of the bulge covered is about 11 square degrees. The number of frames in V and B 
bands is small and we do not analyze them with the DIA method. The median seeing 
is 1.3" for our data set. In Table 1 we provide equatorial and galactic coordinates of the 
field centers, the total number of analyzed frames and the number of candidate variables 
detected. Figure 1 schematically shows locations of the OGLE-II bulge fields with respect 
to the Galactic bar. Fields BUL_SC45 and BUL_SC46 were observed much less frequently, 
mostly with the purpose of maintaining phases of variable stars discovered by OGLE-I. 
Observations of fields BUL_SC47-49 started in 1998; the season of 1997 is not available for
them. 



3. Extracting variability from OGLE-II bulge frames using DIA 

The DIA data pipeline we used is based on the image subtraction algorithm developed
by Alard & Lupton (1998) and Alard (2000), and was written by Wozniak (2000). Processing 
of a large 2kx8k pixel frame is performed after dividing it into 512x128 pixel subframes, 
with 14 pixel margin to ensure smooth transitions between coordinate transformations and 
fits to spatially variable PSF for individual pieces. Small subframe size allows us to use 
polynomial fits for drift-scan images, in which PSF shape and local coordinate system vary 
on scales of 100-200 pixels. The reference image, subtracted from all images of any given 
field, is a stack of 20 best images in the sequence. 

We adopted the kernel expansion used by Wozniak (2000), generally applicable to all OGLE-II
data. The kernel model, represented by a 15x15 pixel raster, consists of 3 Gaussians with
sigmas 0.78, 1.35, and 2.34 pixels, multiplied by polynomials of orders 4, 3, and 2, respectively.
The pipeline delivers a list of candidate variable objects and their difference light curves.
The initial filtering is very weak, with only a minimum of assumptions made about the
variability type. Candidate variables are flagged as "transient" or "continuous" variables
depending on whether variability is confined to episodes in an otherwise quiet object, or
spread throughout the observed time interval. The total number of candidate variables in
all 49 fields was slightly over 220,000, including 150,000 "continuous" and 66,000 "transient"
cases. Only 4600 objects passed both filters, which suggests that the class definitions are sensible. The
number of detected variable objects in a given field depends on the number density of stars, 
extinction, and number of available measurements. This ranged from about 800 to over 9000 
per field. The photometry files distributed with this publication do not contain some of the 
auxiliary information provided by the pipeline. In binary format they amount to slightly 
over 1.3 GB. Reference images take additional 1.6 GB of storage. We compress the data 
when possible. 
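The kernel model quoted above can be sketched numerically. The block below is a minimal illustration, assuming the usual Alard & Lupton functional form in which each basis function is a Gaussian multiplied by a monomial x^i y^j with i + j not exceeding the polynomial order. The function name, the normalization, and the ordering of terms are assumptions made for this sketch and are not taken from the pipeline.

```python
import numpy as np

def kernel_basis(half_width=7, sigmas=(0.78, 1.35, 2.34), orders=(4, 3, 2)):
    """Build Gaussian-times-polynomial basis images on a square raster.

    half_width=7 gives the 15x15 pixel raster quoted in the text.
    The functional form is assumed, not copied from the pipeline.
    """
    u = np.arange(-half_width, half_width + 1, dtype=float)
    x, y = np.meshgrid(u, u)
    r2 = x**2 + y**2
    basis = []
    for sigma, order in zip(sigmas, orders):
        gauss = np.exp(-r2 / (2.0 * sigma**2))
        # All monomials x^i y^j with total degree i + j <= order.
        for i in range(order + 1):
            for j in range(order + 1 - i):
                basis.append(gauss * x**i * y**j)
    return np.array(basis)

basis = kernel_basis()
print(basis.shape)  # one image per basis function
```

Under these assumptions the three components contribute 15, 10, and 6 terms, so the kernel at any position is a linear combination of 31 basis images.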

The error distribution in measurements from our DIA pipeline is nearly Gaussian with 
the average scatter only 17% above the Poisson limit for faint stars near I = 17-19 mag,
gradually increasing for brighter stars, and reaching 2.5 times photon noise at I ~ 11 mag
(about 0.5% of the total flux). Error bars adopted in this paper are photon noise estimates 
renormalized using the curve of Wozniak (2000). As the method effectively monitors all 
pixels, variable objects may be discovered even where no object is detected in the reference 
image. In regular searches monitoring is conducted only for objects detected in a single good 
quality image, a template. This issue is related to centroid finding. Currently in our DIA
pipeline centroids are calculated based on the variable signal in a number of frames. As a 
result the centroid will be poorly known for an object with low S/N variability, even if it is 
very bright on the reference image. Ideally one would use both pieces of information. It is 
usually obvious how to determine the centroid in the presence of blending when confronted 
with one particular object of interest, but an optimal algorithm for extracting all variability 
in the field using DIA is yet to be developed. 

Difference fluxes were converted to magnitudes using reference flux values obtained 
from DoPhot photometry on reference frames. The process of matching units was identical 
to that in Wozniak (2000). DIA observations were tied to the OGLE database of regular 
PSF photometry. Most fields were calibrated to 0.05 mag accuracy; however, at the time
of this analysis for 10 fields (BUL_SC: 7, 9, 20, 25, 28, 32, 43, 47, 48 and 49) only rough
calibration was available and the zero point differences may reach ±0.25 mag. The catalog
will be re-calibrated after merging with the data for the 2000 observing season.

The conversion is given by the formula 

$$m_I = m_0 - 2.5 \log (f + f_{\rm ref}),$$

where $f$ is the difference flux of a particular observation, $f_{\rm ref}$ is the reference flux,
$m_0$ is the magnitude zero point for a given subframe of the reference image, and $m_I$ is the
converted I-band magnitude. All quantities in the formula are available in the catalog and light
curve files (Section 4). Due to noise, occasionally one runs into a problem of negative fluxes in
DIA. For those measurements the difference flux may still be perfectly valid, but the magnitude
will have an error code and the appropriate flag will be set (Section 4 and Appendices).
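As an illustration of the conversion, the short sketch below applies the formula to a single measurement. It assumes the logarithm is base 10, as is conventional for magnitudes, and returns the magnitude error code of -1.0 listed in Section 4 when the total flux is not positive. The function name is hypothetical; the reference flux and zero point in the example are the values from the sample catalog record in Section 4.

```python
import math

BAD_MAG = -1.0  # magnitude error code listed for the light curve files

def diff_flux_to_mag(diff_flux, ref_flux, mag_zero):
    """Convert a DIA difference flux to an I-band magnitude."""
    total_flux = diff_flux + ref_flux
    if total_flux <= 0.0:
        # Non-positive total flux: the magnitude is undefined.
        return BAD_MAG
    return mag_zero - 2.5 * math.log10(total_flux)

# REF_FLUX = 359.5 and MAG_0 = 23.38 come from the sample record;
# the difference fluxes are arbitrary illustrative values.
at_reference = diff_flux_to_mag(0.0, 359.5, 23.38)
undefined = diff_flux_to_mag(-400.0, 359.5, 23.38)
print(at_reference, undefined)
```

A measurement that returns the error code still carries a usable difference flux, so flux-based analyses need not discard it.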

Julian dates of individual observations also bear some discussion. In drift-scan observing 
the time of mid exposure depends on the position of the object. In the case of the Galactic 
Bulge data of OGLE-II, the correction which should be added to the starting time of the 
scan is given by: 

$$dt = (Y + 1024) \times 0.0423816 / 86400, \qquad {\rm HJD} < 2451040$$
$$dt = (Y + 1024) \times 0.0484627 / 86400, \qquad {\rm HJD} > 2451040,$$

where Y is the pixel position in the reference image along the direction of the scan, in the 
range 0-8192 in OGLE-II, resulting in differences reaching several minutes. The time stamps 
of observations in the catalog have been corrected for this effect (Section 4). 
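For users working from the scan starting times, the correction can be sketched as follows. The function name is hypothetical, and the treatment of a date exactly equal to 2451040 is an assumption, since the formula above only defines the two strict inequalities. The division by 86400 converts seconds to days.

```python
SECONDS_PER_DAY = 86400.0
HJD_SPLIT = 2451040.0  # date separating the two drift-scan rates

def drift_scan_correction(y_pixel, hjd_start):
    """Return the correction in days to add to the scan starting time.

    y_pixel is the position along the scan in the reference image.
    Assumption: a date equal to HJD_SPLIT uses the later rate.
    """
    rate = 0.0423816 if hjd_start < HJD_SPLIT else 0.0484627
    return (y_pixel + 1024) * rate / SECONDS_PER_DAY

# Largest correction for the later epoch, at the far end of the scan.
max_late_seconds = drift_scan_correction(8192, 2451100.0) * SECONDS_PER_DAY
min_early_seconds = drift_scan_correction(0, 2450900.0) * SECONDS_PER_DAY
print(max_late_seconds, min_early_seconds)
```

The catalog time stamps already include this correction, so it should only be applied to the uncorrected starting times stored in the first FITS extension.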
4. FTP catalog



The catalog of candidate bulge variables presented in this paper is available via FTP 
from ftp://bulge.princeton.edu/ogle/ogle2/bulge_dia_variables. The data is naturally divided 
into 49 parts for fields BUL_SC1 - BUL_SC49. Reference images for each field (2k x 8k 
FITS frames) are stored in the subdirectory reference_frames. Information is available in
two formats: plain text (subdirectory plain_text) and binary FITS tables (subdirectory
fits_tables). FITS is a standard astronomical format and is easy to use with programs
like IDL. There are two types of files for each field: the catalog of
candidate variables, and the database of light curves. The catalog contains a single entry 
per object with the overall parameters of the light curve and identifying information, like 
coordinates. Below is a sample record with the explanation of fields: 

3764 207.14 6721.60 271.272315 -28.578460 18:05:05.35 -28:34:42.5 

17.407 0.783 359.5 23.38 0.13 1 152 181 181 1 

1. [ ] — number of candidate variable as returned by the pipeline 

2. [X_TPL] — X template coordinate (0.0 is the middle of the bottom left pixel) 

3. [Y_TPL] — Y template coordinate (0.0 is the middle of the bottom left pixel)

4. [RA] — RA in decimal degrees 

5. [DEC] — DEC in decimal degrees 

6. [RA_STR] — RA in sexagesimal hours 

7. [DEC_STR] — DEC in sexagesimal degrees 

8. [MEAN_MAG] — mean of all magnitude values which could be determined 

9. [MAG_SCAT] — scatter of all magnitudes used in mean calculation 

10. [REF_FLUX] — reference flux 

11. [MAG_0] — magnitude zero point 

12. [ID_RAD] — distance between the centroid of the variable and the nearest DoPhot star 
in the reference image in pixels (for the calculation of the reference flux and conversion 
to magnitudes) 
13. [VTYPE] — type of variable coded in bits of a 2 byte integer: 1st bit - "transient", 2nd
bit - "continuous". Therefore the value of the integer will be 1 for "transient", 2 for
"continuous", and 3 for both (see Section 3 for details).

14. [N_FRAMES] — number of frames used in centroid determination 

15. [N_BAD] — number of bad pixels in the fitting radius on the reference image 

16. [NGOOD] — number of "good" flux measurements. A "good" point is one for which
none of the several types of problems monitored by the pipeline occurred (flags 1-10
in Appendix B are set to 0).

17. [NMAG] — number of magnitude values which could be determined (the ones which are
not determined come from non-positive fluxes)

18. [FLAG] — flags, see the explanation below

Several kinds of problematic situations are reported as flags in the last column of the 
catalog file. Flags are explained in Appendix A. 
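The catalog stores each position twice, in decimal degrees (columns 4-5) and as sexagesimal strings (columns 6-7). The sketch below shows one way to convert the strings and check them against the decimal values of the sample record above. The function names are hypothetical, and the colon-separated layout is assumed from that single record.

```python
def hms_to_degrees(ra_str):
    """Convert RA given as hours:minutes:seconds to decimal degrees."""
    hours, minutes, seconds = (float(part) for part in ra_str.split(":"))
    return 15.0 * (hours + minutes / 60.0 + seconds / 3600.0)

def dms_to_degrees(dec_str):
    """Convert DEC given as degrees:arcmin:arcsec to decimal degrees."""
    # Take the sign from the string so that "-00:30:00" stays negative.
    sign = -1.0 if dec_str.strip().startswith("-") else 1.0
    degrees, arcmin, arcsec = (abs(float(part)) for part in dec_str.split(":"))
    return sign * (degrees + arcmin / 60.0 + arcsec / 3600.0)

# Strings from the sample catalog record.
ra_deg = hms_to_degrees("18:05:05.35")
dec_deg = dms_to_degrees("-28:34:42.5")
print(ra_deg, dec_deg)
```

The converted values agree with columns 4-5 of the sample record only to within the rounding of the sexagesimal strings, so the decimal columns are the better choice for precise positional matching.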

Capitalized names after the column number are names of columns in binary FITS tables
(bul_sc*_cat.fts). An empty bracket means that this column is omitted in the FITS table,
but it is present in the text file. In the text version of these files (bul_sc*_cat.dat) columns
have no names and are identified by their order. The database of light curves includes all
measurements for all detected objects. Light curves in plain text format are stored one per
file and grouped by field. For example, subdirectory BUL_SC1 in plain_text contains
4597 bul_sc1_*.dat.gz files, compressed to save space and transfer time. The columns in
light curve files are as follows:

1. [OBS_TIME] — Heliocentric Julian Day of the observation, offset by 2450000.0

2. [DIFF_FLUX] — difference flux (-99.00 for error code)

3. [FLUX_ERR] — difference flux error (-99.00 for error code)

4. [MAG] — I band magnitude (-1.0 for error code)

5. [MAG_ERR] — I band magnitude error (-1.0 for error code)

6. [FLAG] — flags, explained below 
The flags are explained in Appendix B. They provide a lot of information on whether the
measurement is valid or not and common problems which may have affected its reliability. 
To stay on the conservative side, only measurements with no flags should be used (integer 
value 0). In plain_text subdirectory there are also 49 bul_sc*_db.tar files with all light 
curve files for a given field grouped together for convenient transfers. 
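A plain-text light curve can be read and filtered along the following lines. This is a minimal sketch that assumes six whitespace-separated columns in the order listed above. The function names and the three sample rows are invented for illustration and are not catalog data.

```python
import gzip

COLUMNS = ("obs_time", "diff_flux", "flux_err", "mag", "mag_err")

def parse_light_curve(lines):
    """Parse six-column light curve lines into a list of dicts."""
    points = []
    for line in lines:
        parts = line.split()
        if len(parts) != 6:
            continue  # skip blank or malformed lines
        point = {name: float(value) for name, value in zip(COLUMNS, parts)}
        point["flag"] = int(parts[5])
        points.append(point)
    return points

def read_gz(path):
    """Read one compressed light curve file (*.dat.gz)."""
    with gzip.open(path, "rt") as handle:
        return parse_light_curve(handle)

def unflagged(points):
    """Keep only measurements with no flags set (integer value 0)."""
    return [point for point in points if point["flag"] == 0]

# Made-up rows for illustration only.
sample = [
    "550.12345  12.30   4.10  17.380  0.012  0",
    "551.23456 -99.00 -99.00  -1.000 -1.000  3075",
    "552.34567  -8.70   4.30  17.410  0.013  0",
]
points = parse_light_curve(sample)
clean = unflagged(points)
print(len(points), len(clean))
```

Adding 2450000.0 to obs_time recovers the full Heliocentric Julian Day.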

In binary FITS format all light curves for each field are stored in a single table. The first
extension contains the names of the frames and starting times of the drift-scan exposures
in Heliocentric Julian Days shifted by 2450000.0. These time stamps are identical for all
objects in a single image. However, the effective time of mid exposure varies depending on
the position of the object along the scan. The corrected times of observations are provided
for each star separately in the second FITS extension with the actual photometry (Section 3). 
All measurements for all stars are stored in the same columns and identified by their index 
within the column. The number of observations per star is fixed and given by the length of 
the time vector from the first extension. Ordering is such that the number of the individual 
observation within a single light curve is ascending fastest along the column of the binary 
table. If, for example, the number of dates in the first extension of bul_sc1_db.fts is 197,
the first 197 rows of the second FITS extension correspond to the first light curve, the next
197 rows are the second light curve and so on. The total number of rows is 197 x 4597,
where 4597 is the number of candidate variables in BUL_SC1 field, the same as the number of
rows in catalog files bul_sc1_cat.dat and bul_sc1_cat.fts. This information, along with
several other useful numbers, is stored in headers. 
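The row ordering described above reduces to simple index arithmetic, sketched here with a stand-in array in place of a real table column. The helper name and the zero-based star index are assumptions of this sketch; reading the FITS file itself is left to whichever FITS library the user prefers.

```python
import numpy as np

def light_curve_rows(star_index, n_epochs):
    """Rows of the second extension holding one light curve.

    star_index is zero-based (an assumption of this sketch).
    """
    start = star_index * n_epochs
    return slice(start, start + n_epochs)

n_epochs, n_stars = 197, 4597  # BUL_SC1 values quoted in the text

# Stand-in for one column of the second extension: row numbers.
column = np.arange(n_epochs * n_stars)

second_curve = column[light_curve_rows(1, n_epochs)]

# Equivalent view: one row per star, one column per epoch.
by_star = column.reshape(n_stars, n_epochs)
print(second_curve[0], second_curve[-1], by_star.shape)
```

Reshaping to (number of stars, number of epochs) works because the epoch index varies fastest, and it avoids a loop when the same operation is applied to every light curve.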

In Appendix C we include the explanation and values of the pipeline parameters which 
were important for detection of variables. 

5. Discussion and future work 

As mentioned before, the current edition of the catalog essentially includes the entire output
of the DIA pipeline as described by Wozniak (2000), supplemented with determinations
of the reference flux to put the light curves on the magnitude scale. Some optimization
has been performed to keep the contamination by artifacts low without rejecting too many
real variable objects, but it must be clearly stated that about 10% of the light curves in
the present release are not real objects and result from various problems, undetected at
the pipeline level. We are extending the work of Mizerski & Bejger (2001) from the first
BUL_SC1 field to all fields in an effort to flag several common types of artifacts and clean
the sample of spurious objects.
Classifying a real variable star versus a spurious one is the first step in the interpretation
of this data. Ultimately we envision increasingly refined information added to the catalog to 
facilitate applications. This should include the full classification of the detected variables, 
cross identification with objects found by 2MASS and in X-ray catalogs, periods for periodic 
sources etc. The work on automated classification of periodic variables is in progress. In 
addition to examining 2-D projections of a multidimensional parameter space and trying to 
code a human made algorithm (see Mizerski & Bejger 2001 and Wozniak et al. 2001 for 
such work on this data), we are experimenting with data mining techniques. Even with the
current volume of data in OGLE-II we believe it is feasible to make the transition from
"telling the computer how to do it" to "telling the computer what to do" and leaving the
rest to the algorithm. A number of standard machine learning tools are available, which take
small preclassified subsets of light curves and can "learn" to classify the rest of the data.

Zebrun, Soszyński et al. (2001) provided a convenient web interface to access the data
on variables in the LMC and SMC. It is our intention to build a similar tool with the 
addition of positional searches. Although the volume of the data which can be accessed 
by browsing web pages is limited in practice, the search by coordinates is a powerful tool 
for numerous applications. Transfer by FTP and distribution of DAT tapes are currently
the primary modes of accessing major parts of this catalog. For your copy of a DAT tape, please
contact Prof. Bohdan Paczyński (email: bp@astro.princeton.edu, mail: Princeton University
Observatory, Princeton, NJ, 08544). To access this archive online use the OGLE web site
http://bulge.princeton.edu/~ogle/ogle2/bulge-dia-variables.

We thank Prof. Paczyński for support and encouragement in this project. This work
was supported by the NSF grant AST-9820314 to B. Paczyński, and Polish KBN grant
2P03D01418 to M. Kubiak. Additional support for P. Wozniak was provided under the 
DOE contract W-7405-ENG-36. 






Table 1. OGLE-II bulge fields. 
Fig. 1. — OGLE bulge fields in galactic coordinates (gnomonic projection, great circles are
mapped to straight lines). Green strips are the OGLE-II scans and blue squares are the old 
OGLE-I fields. Large oval indicates the location of the Galactic bar. Fields are selected in 
windows of low extinction and avoid very bright foreground stars. 
6. APPENDIX A



Catalog flags are coded as single bits of a 4 byte integer and listed below (the least
significant bit first). Integer value 12, e.g., means that flags 3 and 4 are true and the rest
are false. The values quoted for selected pipeline parameters have been actually used to set 
the flags in this analysis. 



1. crowding flag, set if within ±4 pixels of the maximum pixel with flux $f_0$ there is a
secondary local maximum with pixel flux $f > 0.15 \times f_0 \times r$, where $r$ is the distance
from the star centroid in pixels

2. fewer than N_FRAMES = 4 frames used in centroid finding

3. more than N_BAD = 0 bad pixels on the reference image within the fitting radius of 3.0
pixels

4. fewer than a fraction MIN_GFRA = 0.5 of the difference flux measurements are "good", out of
the total number of frames taken for the field. A "good" point is one for which none
of the several types of problems monitored by the pipeline occurred (flags 1-10 in
Appendix B are set to 0).

5. mean magnitude and its scatter could not be calculated because fewer than 2 individual
magnitudes were defined
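The bit coding can be unpacked with a few lines of code. The helper below is a hypothetical sketch that returns the numbers of the flags set in an integer, counting from 1 at the least significant bit as in the list above.

```python
def set_flags(value, n_flags):
    """Return the 1-based numbers of the flags set in an integer."""
    return [k for k in range(1, n_flags + 1) if value & (1 << (k - 1))]

# The catalog has 5 flags; 12 is the worked example from the text.
example = set_flags(12, 5)
print(example)  # → [3, 4]
```

The same helper applies to the light curve flags of Appendix B when called with n_flags=12.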






7. APPENDIX B 

Light curve flags are coded as single bits of a 4 byte integer and listed below (the least 
significant bit first). Integer value 12, e.g., means that flags 3 and 4 are true and the rest 
are false. The values quoted for selected pipeline parameters have been actually used to set 
the flags in this analysis. 

1. pipeline returned error code for difference flux 

2. pipeline returned error code for flux error 

3. $\chi^2$ per pixel of the difference subframe larger than MAXCHI2I = 6.0

4. $\chi^2$ per pixel of the PSF fit larger than MAXCHI2N = 1.0e32 (effectively no cut)

5. FWHM of the PSF fit larger than MAX_FWHM = 3.4 pix 

6. number of bad pixels within the fitting radius larger than MAX_NBAD = 3 

7. correlation coefficient with the PSF lower than MIN_CORR = 0.0 (effectively no cut) 

8. star in the rejected region of the CCD (currently empty) 

9. flux error NSIGERR = 10 times larger than percentile ERRFRAC = 0.5 of all individual 
flux errors (0.5 corresponds to median) 

10. $\chi^2$ per pixel of the PSF fit NSIGCHI2 = 10 times larger than percentile CHI2FRAC = 0.5
of all individual values (0.5 corresponds to median)

11. magnitude could not be calculated due to missing or non-positive fluxes 

12. magnitude error could not be calculated due to missing or non-positive values 
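Since a "good" flux measurement is defined by flags 1-10 only, a measurement can be good in flux while lacking a magnitude (flags 11-12). A minimal sketch of the two tests follows; the function and constant names are hypothetical.

```python
GOOD_MASK = (1 << 10) - 1   # bits of flags 1-10
NO_MAG_BIT = 1 << 10        # bit of flag 11

def is_good_flux(flag):
    """True if none of flags 1-10 is set."""
    return (flag & GOOD_MASK) == 0

def has_magnitude(flag):
    """True if flag 11 (magnitude not calculated) is not set."""
    return (flag & NO_MAG_BIT) == 0

# Flags 11 and 12 together: valid flux, but no magnitude.
flag_value = (1 << 10) | (1 << 11)
print(is_good_flux(flag_value), has_magnitude(flag_value))
```

The conservative selection recommended in Section 4 (flag value 0) is stricter than is_good_flux, because it also rejects points without a magnitude.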
8. APPENDIX C

Explanation of light curve cleaning parameters. The values quoted have been used to 
set the flags in Appendices A and B. 



1. [MAXNMADO = 0] — max number of bad pixels on the reference image within the fitting
radius

2. [MINNFRMO = 4] — min number of frames used in centroid calculation

3. [MAX_NBAD = 3] — max number of bad pixels on a given image within the fitting radius

4. [MIN_GFRA = 0.5] — min fraction of good points within entire sequence of frames

5. [BAD_FLUX = -99.0] — error code for difference flux

6. [BAD_ERR = -99.0] — error code for flux error

7. [MAXCHI2N = 1.0e32] — max $\chi^2$ per pixel for PSF fit

8. [MAXCHI2I = 6.0] — max $\chi^2$ per pixel for difference subframe

9. [MIN_CORR = 0.0] — min correlation coefficient with the PSF

10. [MAX_FWHM = 3.4] — max FWHM in pixels

11. [ERRCLN = 1] — is flagging of large error bars on? (1 = yes)

12. [NSIGERR = 10.0] — base threshold is multiplied by this factor to get the final threshold
for error bar

13. [ERRFRAC = 0.5] — percentile of the error distribution for base threshold

14. [CHI2CLN = 1] — is flagging of poor PSF fits on? (1 = yes)

15. [NSIGCHI2 = 10.0] — base threshold is multiplied by this factor to get the final threshold
for $\chi^2$ of the PSF fit

16. [CHI2FRAC = 0.5] — percentile of the $\chi^2$ per pixel distribution for base threshold



